Bio-Inspired Techniques in the Clustering of Texts: Synthesis and Comparative Study
Authors
Abstract
Today, the development of large-scale internet and intranet access networks has increased the amount of textual information available online and offline, with billions of documents created. In the last few years, bio-inspired techniques have entered the world of text mining, where tasks such as clustering represent a critical problem for the digital society, especially in information retrieval (IR). This article recapitulates a set of works as a comparative study between the authors' experiments, carried out by applying a set of bio-inspired techniques (social spiders (SS), 2D cellular automata (2D-CA), 3D cellular automata (3D-CA), artificial immune system (AIS), particle swarm optimization (PSO)), and other techniques found in the literature (ant colony optimization (ACO) and genetic algorithms (GAs)) for solving the text clustering challenge using the Reuters-21578 benchmark. They analyse the results in terms of entropy, F-measure, execution time, and number of clusters in order to find the ideal configuration (similarity measure and text representation method) for each technique. Their objectives are to improve the efficiency of text clustering systems and to support decisions that can serve as a starting point for other researchers.
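The abstract names entropy and F-measure as evaluation criteria but does not give their formulas. As a minimal sketch, assuming the definitions commonly used in the text clustering literature (size-weighted cluster entropy, and a class-weighted best-match F-measure), the two scores could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the article.

```python
import math
from collections import Counter, defaultdict


def clustering_entropy(labels, clusters):
    """Size-weighted average entropy (base 2) of the class mix in each cluster.

    Assumed definition: E = sum_j (n_j / n) * (-sum_i p_ij * log2 p_ij),
    where p_ij is the share of class i inside cluster j. Lower is better.
    """
    n = len(labels)
    members = defaultdict(list)
    for label, cluster in zip(labels, clusters):
        members[cluster].append(label)
    total = 0.0
    for docs in members.values():
        size = len(docs)
        counts = Counter(docs)
        cluster_entropy = -sum(
            (c / size) * math.log2(c / size) for c in counts.values()
        )
        total += (size / n) * cluster_entropy
    return total


def clustering_f_measure(labels, clusters):
    """Class-weighted best-match F-measure. Higher is better.

    Assumed definition: F = sum_i (n_i / n) * max_j F(i, j), where F(i, j)
    is the harmonic mean of precision n_ij / n_j and recall n_ij / n_i.
    """
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = joint.get((cls, clu), 0)
            if n_ij == 0:
                continue  # no overlap between this class and this cluster
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


if __name__ == "__main__":
    # Hypothetical toy data: true topic labels and assigned cluster ids.
    true_labels = ["earn", "earn", "grain", "grain"]
    assigned = [0, 0, 1, 1]
    print(clustering_entropy(true_labels, assigned))
    print(clustering_f_measure(true_labels, assigned))
```

Under these assumed definitions, a clustering that matches the reference classes exactly has entropy 0 and F-measure 1, so the two scores move in opposite directions as quality improves.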
Similar resources
Hybrid Bio-Inspired Clustering Algorithm for Energy Efficient Wireless Sensor Networks
In order to achieve the sensing, communication and processing tasks of Wireless Sensor Networks, an energy-efficient routing protocol is required to manage the dissipated energy of the network and to minimize the traffic and the overhead during the data transmission stages. Clustering is the most common technique to balance energy consumption amongst all sensor nodes throughout the network. I...
Extraction of Respiratory Signal Based on Image Clustering and Intensity Parameters at Radiotherapy with External Beam: A Comparative Study
Background: Since tumors located in the thorax region of the body mainly move due to respiration, modern radiotherapy has seen many attempts, such as external markers, strain gauges and spirometers, to monitor patients' breathing signals. With the advent of the fluoroscopy technique, indirect methods were proposed as an alternative approach to extract patients' breathing signals...
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
Optimal Idle Speed Control of a Natural Aspirated Gasoline Engine Using Bio-inspired Meta-heuristic Algorithms
In order to lower the emission levels of internal combustion engines (ICEs), they should be optimally controlled. However, ICEs operate under numerous operating conditions, which in turn makes it difficult to design a controller for such nonlinear systems. In this article, a generalized unique controller for idle speed control under whole loading conditions is designed. In the current study, in...
Journal title:
- Int. J. of Applied Metaheuristic Computing
Volume 6, Issue
Pages -
Publication date 2015